
    Income Thresholds and Income Classes

    This paper proposes a method for detecting income classes based on the change-point problem. There is an increasing demand for such a method in the literature: computing polarization indices requires a pre-grouping of incomes, and indices of social exclusion, and sometimes indices of income inequality, require the detection of thresholds. The estimation procedure is implemented using a bootstrap technique. Finally, the method is applied to EU member states and to the United States. Keywords: income distribution, change-point, thresholds.
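
The abstract above does not spell out its estimator, so the following is only a minimal sketch of the general idea, under stated assumptions: a single threshold is placed where splitting the sorted incomes into two groups minimizes the within-group sum of squares, and a nonparametric bootstrap gives a rough picture of how variable that threshold is. The function names (`best_split`, `threshold`, `bootstrap_thresholds`) and the least-squares criterion are illustrative choices, not taken from the paper.

```python
import random


def best_split(sorted_values):
    """Return k in 1..n-1 so that splitting into [0:k] and [k:n]
    minimizes the total within-group sum of squares."""
    n = len(sorted_values)
    total = sum(sorted_values)
    total_sq = sum(v * v for v in sorted_values)
    best_k, best_cost = None, float("inf")
    left_sum = left_sq = 0.0
    for k in range(1, n):
        v = sorted_values[k - 1]
        left_sum += v
        left_sq += v * v
        right_sum = total - left_sum
        right_sq = total_sq - left_sq
        # Sum of squares around each group mean: sum(x^2) - (sum x)^2 / size
        cost = (left_sq - left_sum ** 2 / k) + (right_sq - right_sum ** 2 / (n - k))
        if cost < best_cost:
            best_k, best_cost = k, cost
    return best_k


def threshold(values):
    """Income threshold: midpoint between the two observations at the split."""
    s = sorted(values)
    k = best_split(s)
    return 0.5 * (s[k - 1] + s[k])


def bootstrap_thresholds(values, n_boot=200, seed=0):
    """Re-estimate the threshold on resamples drawn with replacement."""
    rng = random.Random(seed)
    n = len(values)
    return [
        threshold([values[rng.randrange(n)] for _ in range(n)])
        for _ in range(n_boot)
    ]
```

For the toy sample `[10, 11, 12, 13, 50, 51, 52, 53]`, `threshold` returns `31.5`, the midpoint of the gap between the two groups. The spread of `bootstrap_thresholds` on the same sample gives a hedged indication of how stable that estimate is.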

    Functional Data Representation with Merge Trees

    In this paper we address the problem of representing functional data with the tools of algebraic topology. We represent functions by means of merge trees and this representation is compared with that offered by persistence diagrams. We show that these two tree structures, although not equivalent, are both invariant under homeomorphic re-parametrizations of the functions they represent, thus allowing for a statistical analysis which is indifferent to functional misalignment. We employ a novel metric for merge trees and we prove a few theoretical results related to its specific implementation when merge trees represent functions. To showcase the good properties of our topological approach to functional data analysis, we first go through a few examples using data generated in silico, which illustrate and compare the different representations provided by merge trees and persistence diagrams. We then test the approach on the Aneurisk65 dataset, replicating, from our different perspective, the supervised classification analysis which contributed to making this dataset a benchmark for methods dealing with misaligned functional data.
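
The invariance claim can be illustrated on a sampled function. The sketch below is not the paper's merge-tree metric. It only computes the 0-dimensional sublevel-set persistence pairs of a sequence with a union-find and the elder rule, which records the same merge events a merge tree is built from. The function name `sublevel_persistence` is a hypothetical helper, and the re-parametrization is mimicked by repeating samples, a crude discrete stand-in for warping the domain.

```python
def sublevel_persistence(values):
    """0-dimensional sublevel-set persistence of a function sampled on a line.

    Returns sorted (birth, death) pairs; the component born at the global
    minimum never dies and is reported with death = inf.
    """
    n = len(values)
    parent = [None] * n   # None marks a sample not yet added to the sublevel set
    birth = [None] * n    # birth[root] is the minimum value of that component

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path compression
            i = parent[i]
        return i

    pairs = []
    # Sweep the level upwards, adding samples in order of increasing value.
    for i in sorted(range(n), key=lambda j: values[j]):
        parent[i] = i
        birth[i] = values[i]
        for j in (i - 1, i + 1):
            if 0 <= j < n and parent[j] is not None:
                a, b = find(i), find(j)
                if a == b:
                    continue
                # Elder rule: the component with the later birth dies at this level.
                young, old = (a, b) if birth[a] > birth[b] else (b, a)
                if birth[young] < values[i]:  # skip zero-persistence pairs
                    pairs.append((birth[young], values[i]))
                parent[young] = old
    return sorted(pairs + [(min(values), float("inf"))])
```

With `f = [0, 3, 1, 4, 2, 5]` and the warped copy `[0, 0, 3, 1, 1, 1, 4, 2, 5, 5]`, both calls return `[(0, inf), (1, 3), (2, 4)]`: stretching parts of the domain changes where the extrema occur, but not the values at which components are born and merge.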

    Hierarchical independent component analysis: A multi-resolution non-orthogonal data-driven basis

    A new method named Hierarchical Independent Component Analysis (HICA) is presented, particularly suited to two problems in the analysis of high-dimensional and complex data: dimensional reduction and multi-resolution analysis. It is set in the Blind Source Separation framework, where the goal is to find a basis for a reduced-dimension space in which to represent the data, and whose basis elements represent physical features of the phenomenon under study. In this setting an orthogonal basis may not be suitable, since orthogonality introduces an artificial constraint unrelated to the phenomenological properties of the problem analyzed. For this reason this new approach is introduced. It is obtained by integrating Treelets and Independent Component Analysis, and it provides a multi-scale, non-orthogonal, data-driven basis. Furthermore, a strategy to perform dimensional reduction with a non-orthogonal basis is presented, and the theoretical properties of Hierarchical Independent Component Analysis are analyzed. Finally, the HICA algorithm is tested both on synthetic data and on a real dataset of electroencephalographic traces.

    On the role of statistics in the era of big data: A call for a debate

    While discussing the plenary talk of Dunson (2016) at the 48th Scientific Meeting of the Italian Statistical Society, I formulated a few general questions on the role of statistics in the era of big data, which stimulated an interesting debate. They are reported here with the aim of engaging a larger audience on an issue which promises to radically change our discipline and, more generally, science as we know it. But is it so?

    Object Oriented Geostatistical Simulation of Functional Compositions via Dimensionality Reduction in Bayes spaces

    We address the problem of geostatistical simulation of spatial complex data, with emphasis on functional compositions (FCs). We pursue an object oriented geostatistical approach and interpret FCs as random points in a Bayes Hilbert space. This enables us to deal with data dimensionality and constraints by relying on a solid geometric basis, and to develop a simulation strategy consisting of: (i) optimal dimensionality reduction of the problem through a simplicial principal component analysis, and (ii) geostatistical simulation of random realizations of FCs via an approximate multivariate problem. We illustrate our methodology on a dataset of natural soil particle-size densities collected in an alluvial aquifer.
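
One standard way to work with densities as points of a Bayes Hilbert space is the centred log-ratio (clr) transform, which maps a positive density to a zero-mean function on which ordinary linear methods such as principal component analysis can operate. The sketch below shows only that mapping for a density sampled on a uniform grid. It does not reproduce the paper's simplicial principal component analysis or its simulation step, and the names `clr`, `clr_inverse` and `step` are illustrative assumptions.

```python
import math


def clr(density):
    """Centred log-ratio transform of a strictly positive sampled density."""
    logs = [math.log(v) for v in density]
    mean_log = sum(logs) / len(logs)
    return [x - mean_log for x in logs]


def clr_inverse(z, step):
    """Map clr coordinates back to a density that integrates to 1
    (Riemann sum on a uniform grid with spacing `step`)."""
    expz = [math.exp(v) for v in z]
    total = sum(expz) * step
    return [v / total for v in expz]
```

Two properties are worth checking on any example: the clr values sum to zero (up to rounding), and rescaling the density by a positive constant leaves them unchanged, which reflects the fact that proportional densities represent the same element of the Bayes space.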

    A Class-Kriging predictor for Functional Compositions with Application to Particle-Size Curves in Heterogeneous Aquifers

    This work addresses the problem of characterizing the spatial field of soil particle-size distributions within a heterogeneous aquifer system. The medium is conceptualized as a composite system, characterized by spatially varying soil textural properties associated with diverse geomaterials. The heterogeneity of the system is modeled through an original hierarchical model for particle-size distributions, which are here interpreted as points in the Bayes space of functional compositions. This theoretical framework allows spatial prediction of functional compositions through a functional compositional Class-Kriging predictor. To tackle the lack of information that arises when the spatial arrangement of soil types is unobserved, a novel clustering method is proposed, which makes it possible to infer a grouping structure from sampled particle-size distributions. The proposed methodology enables one to project the complete information content embedded in the set of heterogeneous particle-size distributions to unsampled locations in the system. These developments are tested on a field application relying on a set of particle-size data observed within an alluvial aquifer in the Neckar river valley, in Germany.